Kathleen R. McKeown

Columbia University

Using the Annotated Bibliography as a Resource for Indicative Summarization

Jun 04, 2002

Min-Yen Kan, Judith L. Klavans, Kathleen R. McKeown

Abstract: We report on a language resource consisting of 2000 annotated bibliography entries, which is being analyzed as part of our research on indicative document summarization. We show how annotated bibliographies cover certain aspects of summarization that have not been well-covered by other summary corpora, and motivate why they constitute an important form to study for information retrieval. We detail our methodology for collecting the corpus, and overview our document feature markup that we introduced to facilitate summary analysis. We present the characteristics of the corpus, methods of collection, and show its use in finding the distribution of types of information included in indicative summaries and their relative ordering within the summaries.

* Proceedings of LREC 2002, Las Palmas, Spain. pp. 1746-1752
* 8 pages, 3 figures

Applying Natural Language Generation to Indicative Summarization

Jul 16, 2001

Min-Yen Kan, Kathleen R. McKeown, Judith L. Klavans

Abstract: Creating indicative summaries that help a searcher decide whether to read a particular document is a difficult task. This paper examines the indicative summarization task from a generation perspective, by first analyzing its required content via published guidelines and corpus analysis. We show how these summaries can be factored into a set of document features, and how an implemented content planner uses the topicality document feature to create indicative multidocument query-based summaries.

* 8 pages, published in Proc. of 8th European Workshop on NLG

Resources for Evaluation of Summarization Techniques

Oct 13, 1998

Judith L. Klavans, Kathleen R. McKeown, Min-Yen Kan, Susan Lee

Abstract: We report on two corpora to be used in the evaluation of component systems for the tasks of (1) linear segmentation of text and (2) summary-directed sentence extraction. We present characteristics of the corpora, methods used in the collection of user judgments, and an overview of the application of the corpora to evaluating the component system. Finally, we discuss the problems and issues with construction of the test set which apply broadly to the construction of evaluation resources for language technologies.

* in Proc. of First International Conference on Language Resources and Evaluation, Rubio, Gallardo, Castro, and Tejada (eds.), Granada, Spain, 1998
* LaTeX source, 5 pages, US Letter, uses lrec98.sty

Linear Segmentation and Segment Significance

Sep 15, 1998

Min-Yen Kan, Judith L. Klavans, Kathleen R. McKeown

Abstract: We present a new method for discovering a segmental discourse structure of a document while categorizing segment function. We demonstrate how retrieval of noun phrases and pronominal forms, along with a zero-sum weighting scheme, determines topicalized segmentation. Furthermore, we use term distribution to aid in identifying the role that the segment performs in the document. Finally, we present results of evaluation in terms of precision and recall which surpass earlier approaches.

* Proceedings of 6th International Workshop of Very Large Corpora (WVLC-6), Montreal, Quebec, Canada: Aug. 1998. pp. 197-205
* 9 pages, US Letter, 4 figures. Software License can be found at http://www.cs.columbia.edu/nlp/licenses/segmenterLicenseDownload.html

Building a Generation Knowledge Source using Internet-Accessible Newswire

Feb 25, 1997

Dragomir R. Radev, Kathleen R. McKeown

Abstract: In this paper, we describe a method for automatic creation of a knowledge source for text generation using information extraction over the Internet. We present a prototype system called PROFILE which uses a client-server architecture to extract noun-phrase descriptions of entities such as people, places, and organizations. The system serves two purposes: as an information extraction tool, it allows users to search for textual descriptions of entities; as a utility to generate functional descriptions (FD), it is used in a functional-unification based generation system. We present an evaluation of the approach and its applications to natural language generation and summarization.

* To appear in Proceedings of the 5th Conference on Applied Natural Language Processing, Washington DC, 31 March - 3 April, 1997.
* 8 pages, uses epsf

Gathering Statistics to Aspectually Classify Sentences with a Genetic Algorithm

Oct 21, 1996

Eric V. Siegel, Kathleen R. McKeown

Abstract: This paper presents a method for large corpus analysis to semantically classify an entire clause. In particular, we use cooccurrence statistics among similar clauses to determine the aspectual class of an input clause. The process examines linguistic features of clauses that are relevant to aspectual classification. A genetic algorithm determines what combinations of linguistic features to use for this task.

* postscript, 9 pages, Proceedings of the Second International Conference on New Methods in Language Processing, Oflazer and Somers ed.

Emergent Linguistic Rules from Inducing Decision Trees: Disambiguating Discourse Clue Words

Aug 13, 1994

Eric V. Siegel, Kathleen R. McKeown

Abstract: We apply decision tree induction to the problem of discourse clue word sense disambiguation with a genetic algorithm. The automatic partitioning of the training set which is intrinsic to decision tree induction gives rise to linguistically viable rules.

* AAAI94 proceedings
